Adaptive data processing scheme based on delay forecast

ABSTRACT

The present invention relates to data processing circuitry and a method for processing an input data pattern and outputting an output data pattern after a processing delay which depends on a processing activity of the data processing circuitry, wherein the processing delay is estimated based on the input data pattern and the processing is controlled in response to the estimated processing delay. The processing control may be a power control based on activity monitoring or a clock control in a pipeline structure. Thereby, an efficient solution is provided to derive the current activity of the processing circuitry in order to dynamically adapt its operating conditions to its demands.

The present invention relates to a method and data processing circuitry for processing an input data pattern and for outputting an output data pattern after a processing delay which depends on a processing activity of said data processing circuitry.

Integrated systems are being introduced into a range of applications to undertake comprehensive control functions. In general, strong dynamic coupling between processes requires specific control. Even if the individual processes are stable, the coupled processes might not be. Thus, the relationship between system architecture and control performance must be determined to ensure reliable operation with minimum performance degradation and optimum power supply.

In order to reduce power waste in current integrated systems, an efficient approach is to estimate or deduce the current activity of such a system in order to dynamically adapt its operating conditions, such as power supply and frequency, to its demands. In this way the system can be supplied with only the power it requires, i.e. more power at high activity levels and less power at low activity levels.

Furthermore, in pipeline systems, the frequency of the clock signal must be selected such that each stage of the pipeline processing structure has enough time to complete its operation correctly in every working condition and with every input pattern. However, it is well known that a generic pipeline stage produces its output with a delay which depends on the current input pattern. The standard pipeline strategy adopted in synchronous systems therefore fails to exploit this behavior.

It is therefore an object of the present invention to provide an improved data processing circuitry and processing control method, by means of which various operating conditions of integrated systems can be dynamically adapted to the current system activity.

This object is achieved by a data processing circuitry as claimed in claim 1 and a processing control method as claimed in claim 12.

Accordingly, the processing delay is estimated on the basis of the input data pattern to obtain information about the system activity. Every time a new input pattern is received, the output pattern will be generated after a certain delay. This delay depends on the processing activity introduced or induced by the new input pattern. It can thus be concluded that the input pattern causing the greatest delay is most likely to produce the maximum activity inside the module. The estimated activity can then be used to optimize operating conditions or parameters such as power supply, clock frequency or the like of the integrated system.

Consequently, a simple technique is provided which can be adopted even in current system designs and which is scalable for systems of different size, to thereby increase system performance with respect to various system parameters.

Moreover, due to the fact that the proposed estimation can be implemented on top of any standard design, compatibility with standard tooling and standard design techniques can be achieved.

The estimation means may comprise a look-up table for storing the estimated processing delay. Alternatively, the estimation means may comprise a programmable delay line which is programmed by the input data pattern. In the first case, the look-up table may be addressed by the input data pattern to output the estimated processing delay. In the latter case, the programmable delay line may be adapted to generate an output signal after expiry of the estimated processing delay.

The estimation means may be adapted to estimate the processing delay based on a sequence of input data patterns. Thereby, a forecast of the activity and its development is possible.

The control means may be arranged to derive the processing activity from the estimated delay, and to control the power supply of the data processing circuitry in response to the derived processing activity. The power supply can thus be dynamically adapted to the operating conditions of the system.

As an example of another operating parameter or condition, the control means may be adapted to control the clock supply to the data processing circuitry in response to the estimated processing delay. Thereby, the clock supply for each stage of a pipeline structure can be selectively gated based on the processing delay of each stage. As a result, each pipeline stage has enough time to complete its operation correctly in every working condition and with every input pattern. In particular, the control means may be arranged to un-gate the clock supply if the previous stage has produced a valid output signal and the following stage has stored the output signal. The estimated processing delay may be expressed as a number of cycles of the clock signal.

In the following, the present invention will be described in greater detail on the basis of preferred embodiments with reference to the accompanying drawings, in which:

FIG. 1 shows a schematic block diagram of a data processing scheme according to a first preferred embodiment;

FIG. 2 shows an example of a feedback control loop using the proposed processing scheme according to the first preferred embodiment;

FIG. 3 shows a schematic block diagram of a standard pipelining scheme;

FIG. 4 shows a schematic block diagram of a pipelining scheme according to a second preferred embodiment; and

FIG. 5 shows a more detailed diagram of a typical stage of the proposed pipelining scheme according to the second preferred embodiment.

A first preferred embodiment will now be described on the basis of a processing scheme as shown in FIG. 1.

According to FIG. 1, an input data pattern “i[i−1]” is supplied to a generic logic module 20. Every time the generic logic module 20 receives an input data pattern “i[i−1]”, a new output pattern “o[i−1]” will be generated after a certain delay. The generic logic module 20 may be any kind of data processing device or circuitry arranged to generate an output data pattern based on a supplied input data pattern.

According to the first preferred embodiment, a processing delay in the generic logic module 20 is estimated based on the input data pattern “i[i−1]” using a programmable memory device, such as a look-up table 30, in which estimated processing delays for the generic logic module 20 have been stored. The look-up table 30 with the estimated delays can be easily generated at design time of the logic circuitry based on simulations of the generic logic module 20 under control. Thus, the input data pattern “i[i−1]” is also supplied to the look-up table 30 so as to address the look-up table 30, which outputs a corresponding delay value “a[i−1]” or a corresponding activity value which is based on the delay value “a[i−1]”. Furthermore, the processing scheme comprises a latch or register 10 which stores a new input data pattern “i[i]” in response to a clock signal “clk” supplied to a clock input of the register 10. Thus, the new input data pattern “i[i]” can be stored during the processing delay of the previous input data pattern “i[i−1]”.
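The design-time generation and run-time addressing of the look-up table 30 can be sketched as follows. This is only an illustration under stated assumptions: a 4-bit ripple-carry adder stands in for the generic logic module 20, the delay model (one base unit plus the longest carry ripple) is invented for the example, and the names `carry_chain_length`, `build_delay_lut` and `estimate_delay` are hypothetical and do not appear in the specification.

```python
# Illustrative only: a 4-bit ripple-carry adder stands in for the
# generic logic module 20, and the delay model is an assumption.

WIDTH = 4  # operand width in bits (hypothetical)


def carry_chain_length(a, b, width=WIDTH):
    """Return the longest carry ripple, in bit positions, when adding a and b."""
    longest = run = carry = 0
    for bit in range(width):
        x = (a >> bit) & 1
        y = (b >> bit) & 1
        if carry and (x ^ y):
            run += 1   # the incoming carry propagates through this bit
        elif x & y:
            run = 1    # a new carry is generated at this bit
        else:
            run = 0    # no carry leaves this bit
        carry = (x & y) | (carry & (x ^ y))
        longest = max(longest, run)
    return longest


def build_delay_lut(width=WIDTH):
    """Design-time step: tabulate one estimated delay per input pattern."""
    lut = []
    for pattern in range(1 << (2 * width)):
        a = pattern >> width               # upper half of the pattern
        b = pattern & ((1 << width) - 1)   # lower half of the pattern
        lut.append(1 + carry_chain_length(a, b, width))  # 1 unit base delay
    return lut


def estimate_delay(lut, pattern):
    """Run-time step: the input pattern addresses the table directly."""
    return lut[pattern]
```

In this sketch the table is indexed by the concatenated operands, so a look-up costs one memory access regardless of how long the module itself takes.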

As a modification or alternative to the above processing circuitry 100, the next or new input data pattern “i[i]” can be used to address the look-up table 30 in order to generate the delay value “a[i−1]”. In this case, the delay value “a[i−1]” corresponds to the delay forecast for the processing of the next input pattern “i[i]” at the generic logic module 20.

As another modification, the look-up table 30 may be replaced by a programmable delay line, which is programmed based on the current or next input data pattern so as to output a signal after a predetermined delay corresponding to the estimated delay value of the processing delay of the generic logic module 20.

Accordingly, the processing circuitry 100 is adapted to provide the delay value “a[i−1]” as an additional output for performing activity monitoring based on the delay forecast.

FIG. 2 shows an example of a feedback control loop using the above processing circuitry 100. According to FIG. 2, a power control unit 35 is provided in the feedback loop. The power control unit 35 evaluates the delay value “a[i−1]” and generates a control output “c[i−1]” which is supplied to the processing circuitry 100 in order to control the power supply to the individual processing units, in particular to the generic logic module 20. Thereby, the power supply or any other suitable operating condition of the processing circuitry 100 can be controlled on the basis of the activity of the generic logic module 20, determined based on the delay forecast.
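One possible mapping from the delay value to the control output of the power control unit 35 is sketched below. The specification does not define this mapping; the linear bucketing, the three supply voltages and the name `select_supply` are assumptions made purely for illustration.

```python
# Hypothetical supply voltages; the specification gives no concrete levels.
SUPPLY_LEVELS_V = (0.8, 1.0, 1.2)


def select_supply(delay, max_delay, levels=SUPPLY_LEVELS_V):
    """Map an estimated delay to a supply level (assumed linear bucketing).

    A larger estimated delay is taken as higher activity and therefore
    selects a higher supply level.
    """
    if not 0 <= delay <= max_delay:
        raise ValueError("delay outside the range covered by the table")
    index = delay * len(levels) // (max_delay + 1)
    return levels[index]
```

With a maximum tabulated delay of 5, delays of 1, 3 and 5 select the lowest, middle and highest level respectively.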

The granularity or resolution of the monitoring process can be changed by changing the resolution of the values stored in the look-up table 30. Furthermore, it is noted that the delay value may be generated based on a sequence of input patterns “i[i]” . . . “i[i+n]”, wherein the register 10 or the look-up table 30 may be arranged to store a plurality of successive input patterns “i[i]” . . . “i[i+n]”, so as to evaluate this sequence. Such an evaluation may be based on a logic processing or comparison of the successive input data patterns “i[i]” . . . “i[i+n]”.
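A comparison of successive input patterns could, for instance, count the bits that toggle between neighbouring patterns in a stored window. The sketch below assumes this toggle count as the activity measure; the specification leaves the evaluation open, and the class name `SequenceEstimator` and its `depth` parameter are hypothetical.

```python
from collections import deque


class SequenceEstimator:
    """Keeps the last `depth` input patterns and scores their switching.

    Assumption: activity is approximated by the number of bits that
    toggle between successive patterns held in the window.
    """

    def __init__(self, depth):
        self.window = deque(maxlen=depth)  # oldest pattern drops out first

    def push(self, pattern):
        """Store a new pattern and return the toggle count of the window."""
        self.window.append(pattern)
        patterns = list(self.window)
        return sum(
            bin(p ^ q).count("1")  # bits differing between neighbours
            for p, q in zip(patterns, patterns[1:])
        )
```

The returned count could then address a delay table or feed a power control in the same way as a single-pattern delay value.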

In the following, a second preferred embodiment will be described as an example of a dynamic adjustment of a clock signal of a pipeline structure. Standard pipelining methods adopt a global clock that controls all the processing elements, e.g. flip-flops, in every stage of the pipeline.

FIG. 3 shows a schematic diagram of a standard pipelining scheme comprising a plurality of pipeline stages A, B and C to which an input signal “i” is supplied and at the output of which an output signal “o” is generated after a predetermined number of clock cycles corresponding to the number of stages. Due to the concurrent parallel supply of the clock signal “clk” to the pipeline stages A, B and C, each stage is clocked at the same timing. Therefore, the frequency of the clock signal “clk” must be selected such that each pipeline stage has enough time to complete its individual operation.

FIG. 4 shows a schematic block diagram of a pipeline structure according to the second preferred embodiment. In the proposed pipeline structure, the data-dependent behavior in the synchronous pipeline circuitry is exploited on the basis of a delay forecast for each individual pipeline stage. The global clock signal “clk” is selectively gated for each stage in the pipeline depending on its current input pattern. Thus, if a pipeline stage has not completed its operation correctly, a pipeline clock generator 40 is adapted to suppress or gate the respective supply of the global clock signal “clk” until a valid output has been produced at the respective stage and the following stage has stored the new result.
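The benefit of exploiting data-dependent delays can be illustrated with simple arithmetic for a single stage. The sketch assumes that delays are expressed in cycles of a fast reference clock, that a conventionally clocked stage must always wait for the worst-case delay, and that a gated stage waits exactly its estimated delay; the function names and the delay figures in the usage note are invented for the example.

```python
def fixed_clock_cycles(delays):
    """Cycles needed when every pattern is given the worst-case delay."""
    return len(delays) * max(delays) if delays else 0


def gated_clock_cycles(delays):
    """Cycles needed when each pattern waits only its own estimated delay."""
    return sum(delays)
```

For the hypothetical delay sequence 1, 1, 5, 2 the worst-case scheme needs 20 reference cycles, whereas the data-dependent scheme needs 9.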

FIG. 5 shows a more detailed block diagram of an individual stage of the proposed pipeline structure shown in FIG. 4. According to FIG. 5, the pipeline clock generator 40 comprises a clock gate 41 and a delay table 42 in which estimated delay values for specific input patterns of the corresponding pipeline stage are stored. These delay values may have been obtained on the basis of simulations or measurements performed at the design stage of the circuitry. Furthermore, the pipeline stage comprises a flip-flop circuit 51 which is set according to the input data pattern “i” in response to a local clock signal “l_clk” generated by the clock gate 41 based on the global clock “clk” supplied thereto. At the output of the flip-flop circuit 51, the stored input data pattern id is supplied to a logic module 50 which is adapted to generate a desired output pattern o after a respective processing delay. The input data pattern id is also supplied to the look-up table 42 together with the local clock “l_clk”. Based on these input signals, the look-up table 42 generates a ready signal “r[i]” after a delay which corresponds to the stored estimated delay of the logic unit 50 for the current input data id.

The generated local clock “l_clk” is also output as a taken signal “t[i−1]” to the previous stage, and the ready signal “r[i−1]” of the previous stage is supplied to the clock gate 41. Furthermore, the taken signal “t[i]” of the succeeding or following stage is supplied to the clock gate 41 of the present stage. If the logic module 50 operates on a clock basis, the local clock signal “l_clk” may be supplied to the logic unit 50 as well, as indicated by the broken arrow in FIG. 5.

The global clock “clk” is selectively gated for each stage in the pipeline depending on its current input data pattern id. The delay table 42 receives as its inputs the gated local clock “l_clk” and the current input pattern id and produces a ready signal “r[i]”. This signal is asserted after a certain delay, which may be expressed as a number of cycles of the global clock “clk”, to signal that the stage has produced a valid output. The estimated delay stored in the delay or look-up table 42 depends on the current input pattern and may have been obtained during the circuit design based on simulations or measurements. Moreover, a programming functionality may be provided at the look-up table 42 so as to update the estimated delays to provide a flexible design. The ready signal “r[i]” is released when the gated local clock “l_clk” goes low. The clock gate circuit 41 un-gates or releases the global clock “clk” when the previous stage has produced a valid output, i.e. when the ready signal “r[i−1]” of the previous stage goes high, and the following stage has stored the new result, i.e. when the taken signal “t[i]” indicates that the local clock of the following stage has been activated, e.g. shows a pulse.
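The interplay of the clock gate 41 and the delay table 42 can be approximated by a cycle-level behavioral model, given below as a rough sketch and not as the circuit of FIG. 5. It assumes one call per cycle of the global clock, treats the gate as a plain AND of the ready signal of the previous stage and the taken signal of the following stage, and simplifies the release timing of the ready signal to the moment of the next local clock pulse. The class name `StageClock` and its attributes are hypothetical.

```python
class StageClock:
    """Behavioral sketch of one stage's clock gate and delay table."""

    def __init__(self, delay_table):
        self.delay_table = delay_table  # pattern -> delay in global cycles
        self.remaining = 0              # cycles left until the output is valid
        self.ready = False              # models the ready signal r[i]

    def tick(self, pattern, prev_ready, next_taken):
        """Advance one global clock cycle; return True on a local clock pulse."""
        # Assumed gate function: pass the clock only if the previous stage
        # is ready and the following stage has taken the last result.
        local_clk = bool(prev_ready and next_taken)
        if local_clk:
            # New pattern latched: look up its delay and restart the count.
            self.remaining = self.delay_table[pattern]
            self.ready = self.remaining == 0
        elif self.remaining > 0:
            # Clock gated: count down towards a valid output.
            self.remaining -= 1
            self.ready = self.remaining == 0
        return local_clk
```

In this model a pattern with a tabulated delay of two raises the ready signal two global clock cycles after the local clock pulse that latched it.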

Accordingly, the gating or suppressing of the global clock “clk” is based on the ready signal and the taken signal, which indicate whether the pipeline stage has enough time to complete its operation correctly. Thereby, each stage mimics the behavior of an asynchronous pipeline stage, but uses the global clock “clk” as a reference clock. In this way, it is possible to retain the data-dependent behavior of asynchronous systems as well as all the advantages of a synchronous system, such as testability, easy design, predictability, etc.

Optionally, the look-up table 42 may use the global clock “clk” to generate the ready signal, as indicated by the broken arrow in FIG. 5. Furthermore, the clock gate circuit 41 may use different strategies or timings in generating and detecting the taken signal. The proposed pipeline clock generator structure may be extended to provide a disabling function in a test and debug mode of the pipeline scheme.

Furthermore, in special circumstances, e.g. where input and output of a stage are connected to the same unit, the taken signal may not be required.

The clock gate circuit 41 may be a simple logic circuit having the desired gating functionality based on the ready signal “r[i−1]” of the previous stage and the taken signal “t[i]” of the following stage.

It is noted that the present invention is not restricted to the above preferred embodiments but can be applied in any data processing circuitry in which a processing delay depends on the input pattern. The preferred embodiments may thus vary within the scope of the attached claims.

1. A data processing circuit for processing an input data pattern and for outputting an output data pattern after a processing delay which depends on a processing activity of said data processing circuitry, said data processing circuitry comprising: a) estimation means for estimating said processing delay based on said input data pattern; and b) control means for controlling said processing by said data processing circuit in response to said estimated processing delay.
2. A circuitry according to claim 1, wherein said estimation means comprises a look-up table for storing said estimated processing delay.
3. A circuitry according to claim 2, wherein said look-up table is addressed by said input data pattern to output said estimated processing delay.
4. A circuitry according to claim 1, wherein said estimation means comprises a programmable delay line which is programmed by said input data pattern.
5. A circuitry according to claim 4, wherein said programmable delay line is adapted to generate an output signal after expiry of said estimated processing delay.
6. A circuitry according to claim 1, wherein said estimation means is adapted to estimate said processing delay based on a sequence of input data patterns.
7. A circuitry according to claim 1, wherein said control means is arranged to derive said processing activity from said estimated delay, and to control power supply of said data processing circuitry in response to said derived processing activity.
8. A circuitry according to claim 1, wherein said control means is adapted to control the clock supply to said data processing circuitry in response to said estimated processing delay.
9. A circuitry according to claim 8, wherein said data processing circuitry has a pipeline structure and said control means is adapted to selectively gate said clock supply for each stage of said pipeline structure.
10. A circuitry according to claim 9, wherein said control means is arranged to un-gate said clock supply if the previous stage has produced a valid output signal and the following stage has stored said output signal.
11. A circuitry according to claim 1, wherein said estimated processing delay is expressed as a number of cycles of said clock signal.
12. A method of controlling processing of an input data pattern, wherein a predetermined output data pattern is generated after a processing delay which depends on an activity of said processing, said method comprising the steps of: a) estimating said processing delay based on said input data pattern; and b) performing said processing control in response to said estimated processing delay.
13. A method according to claim 12, wherein said processing control is a power control based on an activity monitoring.
14. A method according to claim 12, wherein said processing control is a control of a clock supply to a synchronous pipeline structure.